Building “Stuff”?

Charles F. Vardeman II

Center for Research Computing, University of Notre Dame

2023-09-05

Building Stuff?

Building Agents based on Large Language Models!

Building “Stuff”…

But wait, there are more ways to “program” a Large Language Model!

Autoregressive Large Language Model

“An autoregressive large language model (AR-LLM) is a type of neural network model that can generate natural language text. It has a very large number of parameters (billions or trillions) that are trained on a huge amount of text data from various sources. The main goal of an AR-LLM is to predict the next word or token based on the previous words or tokens in the input text. For example, if the input text is “The sky is”, the AR-LLM might predict “blue” as the next word. AR-LLMs can also generate text from scratch by sampling words from a probability distribution. For example, if the input text is empty, the AR-LLM might generate “Once upon a time, there was a princess who lived in a castle.” as the output text.”1
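To make the next-token objective concrete, here is a minimal sketch using the Hugging Face transformers library, with GPT-2 standing in for a much larger AR-LLM (the library and model choice are ours, not from the slides):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# GPT-2 stands in here for any autoregressive LLM.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("The sky is", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits    # shape: (batch, seq_len, vocab_size)

next_id = int(logits[0, -1].argmax())  # greedy: most probable next token
print(tokenizer.decode([next_id]))     # the model's predicted next word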

AR-LLMs are “Turing Machines”

You are here ⬇️

What Kind of LLM Agents are we trying to build?

  • Conversational Agents
  • vs. Cognitive Autonomous Agents
  • vs. Agents tuned for a Data Processing Task

We will focus on Conversational Agents…

The Best Advice we can Give

Caveat: You are at the Edge of Research and Practice!

Prompt Engineering

“Prompt engineering is the process of designing and refining the prompts or input stimuli for a language model to generate specific types of output. Prompt engineering involves selecting appropriate keywords, providing context, and shaping the input in a way that encourages the model to produce the desired response and is a vital technique to actively shape the behavior and output of foundation models.”1
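As a small illustration of the idea (the template and field names are ours), a prompt that fixes the role, supplies context, and constrains the output shape:

# Illustrative prompt template: role, context, and output format are
# made explicit to steer the model toward the desired response.
TEMPLATE = """You are a concise research assistant.

Context:
{context}

Question: {question}

Answer in at most two sentences, using only the context above."""

prompt = TEMPLATE.format(
    context="The Center for Research Computing supports computational research at Notre Dame.",
    question="What does the Center for Research Computing do?",
)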

(GPT-3) InstructGPT: Reinforcement Learning from Human Feedback

Enables Conversational Agents to “Converse” in a Set Style!
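For instruction-tuned chat models, that set style is typically supplied as a system message; a minimal sketch against the 2023-era (pre-1.0) OpenAI Python client:

import openai  # openai<1.0 interface, current as of this talk

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        # The system message fixes the conversational style the
        # RLHF-tuned model will try to maintain.
        {"role": "system", "content": "You are a terse, formal research assistant."},
        {"role": "user", "content": "What is an autoregressive LLM?"},
    ],
)
print(response["choices"][0]["message"]["content"])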

(GPT-3) Large Language Models are Zero-Shot Reasoners (Chain-of-Thought Reasoning)
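The zero-shot chain-of-thought result amounts to appending a single trigger phrase to the prompt; a sketch with an illustrative word problem of our own:

# Zero-shot chain-of-thought (Kojima et al., 2022): appending
# "Let's think step by step." elicits intermediate reasoning steps
# before the final answer.
question = (
    "A cluster has 16 nodes. Half are GPU nodes, and half of the GPU "
    "nodes are reserved. How many GPU nodes are reserved?"
)
cot_prompt = question + "\nLet's think step by step."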

Tools for “Prompt Engineering”

LLMs as Reasoners using Prompts!

Prompt Engineering Guide

We want Large Language Models to be Factual!

  • Fine-Tuning: augment the behavior of the model
  • Retrieval Augmented Generation (RAG): introduce new knowledge to the model
  • Retrieval-Aware Training (RAT): fine-tune the model to use or ignore retrieved content

Retrieval Augmented Generation (RAG)

“Foundation models are usually trained offline, making the model agnostic to any data that is created after the model was trained. Additionally, foundation models are trained on very general domain corpora, making them less effective for domain-specific tasks. You can use Retrieval Augmented Generation (RAG) to retrieve data from outside a foundation model and augment your prompts by adding the relevant retrieved data in context.”1

1 “Retrieval Augmented Generation (RAG) - Amazon SageMaker.” Accessed September 4, 2023. http://tiny.cc/f3mavz.
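A minimal sketch of that retrieve-then-augment loop; search here is a hypothetical stand-in for a real vector-store query:

def search(query: str, k: int = 3) -> list[str]:
    # Hypothetical retriever: in practice, embed the query and return
    # the k nearest document chunks from a vector store.
    return ["(retrieved passage 1)", "(retrieved passage 2)"][:k]

def build_rag_prompt(question: str) -> str:
    # Augment the prompt with retrieved, in-context evidence.
    context = "\n\n".join(search(question))
    return (
        "Answer using only the context below. If the answer is not in "
        "the context, say you don't know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )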

LlamaIndex to Build Hybrid KGs
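A sketch of that pattern with the LlamaIndex API as of its 2023 (0.8.x) releases; the data directory and query are illustrative, and the package has since been reorganized, so check current docs:

from llama_index import KnowledgeGraphIndex, SimpleDirectoryReader, StorageContext
from llama_index.graph_stores import SimpleGraphStore

documents = SimpleDirectoryReader("./data").load_data()  # illustrative path
storage_context = StorageContext.from_defaults(graph_store=SimpleGraphStore())

# Extract (subject, relation, object) triplets from the text into a KG.
index = KnowledgeGraphIndex.from_documents(
    documents,
    max_triplets_per_chunk=2,
    storage_context=storage_context,
)

# "Hybrid": answer from the graph plus the underlying source text.
query_engine = index.as_query_engine(include_text=True)
print(query_engine.query("How are the entities in these documents related?"))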

Gorilla: Retrieval-Aware Training for APIs

“Big Models” vs “Small Models”

  • Models as a service (Amazon Bedrock, OpenAI API, Anthropic Claude)
    • Generally more difficult to “Fine-Tune” (GPT-3.5 Turbo)1
    • Models are generally more capable (factuality, instruction following, reasoning)
    • “Coin-operated”: pay per token
  • “Open License” 7B–70B Parameter Models
    • Mostly based on Meta AI’s LLaMA or Llama 2 models
    • Require more effort to work consistently
    • Can run on reduced hardware
    • Can be fine-tuned for task-specific workflows

Small Models with custom grammar (llama.cpp)

JSON Grammar

root   ::= object
value  ::= object | array | string | number | ("true" | "false" | "null") ws

object ::=
  "{" ws (
            string ":" ws value
    ("," ws string ":" ws value)*
  )? "}" ws

array  ::=
  "[" ws (
            value
    ("," ws value)*
  )? "]" ws

string ::=
  "\"" (
    [^"\\] |
    "\\" (["\\/bfnrt] | "u" [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F]) # escapes
  )* "\"" ws

number ::= ("-"? ([0-9] | [1-9] [0-9]*)) ("." [0-9]+)? ([eE] [-+]? [0-9]+)? ws

# Optional space: by convention, applied in this grammar after literal chars when allowed
ws ::= ([ \t\n] ws)?
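One way to apply the grammar above is through the llama-cpp-python bindings (the model path is illustrative); llama.cpp’s main CLI exposes the same idea via its --grammar-file flag:

from llama_cpp import Llama, LlamaGrammar

# Load the JSON grammar shown above (saved as json.gbnf) and a local model.
grammar = LlamaGrammar.from_file("json.gbnf")
llm = Llama(model_path="./models/llama-2-7b-chat.Q4_K_M.gguf")  # illustrative path

out = llm(
    "Describe this talk as JSON with 'title' and 'topic' fields: ",
    grammar=grammar,  # sampling is constrained to grammar-legal tokens
    max_tokens=256,
)
print(out["choices"][0]["text"])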

“State of GPT” Recommendations

Open Source Community

Today is the First Step on your Journey to Building LLM-Based Applications!